The introduction of relevant physical information into neural network architectures has become a widely used and successful strategy for improving their performance. In lattice gauge theories, such information can be identified with gauge symmetries, which are incorporated into the network layers of our recently proposed Lattice Gauge Equivariant Convolutional Neural Networks (L-CNNs). L-CNNs can generalize better to differently sized lattices than traditional neural networks and are by construction equivariant under lattice gauge transformations. In these proceedings, we present our progress on possible applications of L-CNNs to Wilson flow or continuous normalizing flow. Our methods are based on neural ordinary differential equations, which allow us to modify link configurations in a gauge equivariant manner. For simplicity, we focus on toy models to test these ideas in practice.
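The crucial role played by the underlying symmetries of high energy physics and lattice field theories calls for the implementation of such symmetries in the neural network architectures applied to the physical system under consideration. In these proceedings, we focus on the consequences of incorporating translational equivariance among the network properties, particularly in terms of performance and generalization. The benefits of equivariant networks are exemplified by studying a complex scalar field theory, on which various regression and classification tasks are examined. For a meaningful comparison, promising equivariant and non-equivariant architectures are identified by means of a systematic search. The results indicate that in most of the tasks our best equivariant architectures can perform and generalize significantly better than their non-equivariant counterparts, which applies not only to physical parameters beyond those represented in the training set, but also to different lattice sizes.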
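In recent years, the use of machine learning has become increasingly popular in the context of lattice field theories. An essential element of such theories is represented by symmetries, whose inclusion in the neural network properties can lead to high reward in terms of performance and generalizability. A fundamental symmetry that usually characterizes physical systems on a lattice with periodic boundary conditions is equivariance under spacetime translations. Here we investigate the advantages of adopting translationally equivariant neural networks in favor of non-equivariant ones. The system we consider is a complex scalar field with quartic interaction on a two-dimensional lattice in the flux representation, on which the networks carry out various regression and classification tasks. Promising equivariant and non-equivariant architectures are identified with a systematic search. We demonstrate that in most of these tasks our best equivariant architectures can perform and generalize significantly better than their non-equivariant counterparts, which applies not only to physical parameters beyond those represented in the training set, but also to different lattice sizes.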
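As an illustration of the translational equivariance discussed above, the following minimal sketch checks that a convolution with periodic boundary conditions commutes with lattice translations. It is not taken from the paper: the function names `periodic_conv2d` and `translate`, the kernel, and the lattice sizes are all assumptions chosen for the example.

```python
import numpy as np

def periodic_conv2d(field, kernel):
    """Cross-correlate a 2D field with a small kernel on a periodic lattice."""
    out = np.zeros_like(field, dtype=float)
    kh, kw = kernel.shape
    for i in range(kh):
        for j in range(kw):
            # np.roll wraps around the edges, i.e. periodic boundary conditions;
            # a shift of (-i, -j) brings field[x + i, y + j] to position [x, y].
            out += kernel[i, j] * np.roll(field, shift=(-i, -j), axis=(0, 1))
    return out

def translate(field, dx, dy):
    """Translate the field by (dx, dy) sites on the periodic lattice."""
    return np.roll(field, shift=(dx, dy), axis=(0, 1))

rng = np.random.default_rng(0)
kernel = rng.normal(size=(3, 3))   # hypothetical kernel weights
field = rng.normal(size=(8, 8))    # hypothetical 8x8 lattice configuration

# Equivariance: convolving a translated field equals translating the convolved field.
lhs = periodic_conv2d(translate(field, 3, 5), kernel)
rhs = translate(periodic_conv2d(field, kernel), 3, 5)
equivariant = np.allclose(lhs, rhs)

# The same kernel applies unchanged to a lattice of a different size.
larger = periodic_conv2d(rng.normal(size=(16, 12)), kernel)
```

Because the kernel is independent of the lattice extent, the same weights can be evaluated on other lattice sizes, which is the structural property behind the generalization claim in the abstract.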
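In these proceedings, we present lattice gauge equivariant convolutional neural networks (L-CNNs), which are able to process data from lattice gauge theory simulations while exactly preserving gauge symmetry. We review aspects of the architecture and show how L-CNNs can represent a large class of gauge invariant and equivariant functions on the lattice. We compare the performance of L-CNNs and non-equivariant networks using a non-linear regression problem and demonstrate how gauge invariance is broken for non-equivariant models.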
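We review a novel neural network architecture called lattice gauge equivariant convolutional neural networks (L-CNNs), which can be applied to generic machine learning problems in lattice gauge theory while exactly preserving gauge symmetry. We discuss the concept of gauge equivariance, which we use to explicitly construct a gauge equivariant convolutional layer and a bilinear layer. The performance of L-CNNs and non-equivariant CNNs is compared using seemingly simple non-linear regression tasks, where L-CNNs demonstrate generalizability and achieve a high degree of accuracy in their predictions compared to their non-equivariant counterparts.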
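We propose Lattice gauge equivariant Convolutional Neural Networks (L-CNNs) for generic machine learning applications on lattice gauge theoretical problems. At the heart of this network structure is a novel convolutional layer that preserves gauge equivariance while forming arbitrarily shaped Wilson loops in successive bilinear layers. Together with topological information, for example from Polyakov loops, such a network can in principle approximate any gauge covariant function on the lattice. We demonstrate that L-CNNs can learn and generalize gauge invariant quantities that traditional convolutional neural networks are incapable of finding.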
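To make the notion of a gauge invariant quantity concrete, here is a minimal sketch of the smallest Wilson loop, the plaquette, in a toy setting. It assumes a two-dimensional periodic lattice with U(1) links stored as angles, which is much simpler than the non-Abelian case the abstract targets; the function names `plaquette_angle` and `gauge_transform` and the array layout are hypothetical.

```python
import numpy as np

def plaquette_angle(theta):
    """Phase of the 1x1 Wilson loop at every site of a periodic 2D lattice.

    theta has shape (2, L, L); theta[mu, x, y] is the angle of the U(1) link
    leaving site (x, y) in direction mu (0 = x, 1 = y). Assumed layout.
    """
    tx, ty = theta[0], theta[1]
    # U_x(n) U_y(n + x) U_x(n + y)^* U_y(n)^*  expressed through the link angles
    return tx + np.roll(ty, -1, axis=0) - np.roll(tx, -1, axis=1) - ty

def gauge_transform(theta, alpha):
    """Apply U_mu(n) -> Omega(n) U_mu(n) Omega(n + mu)^* with Omega = exp(i alpha)."""
    out = np.empty_like(theta)
    out[0] = theta[0] + alpha - np.roll(alpha, -1, axis=0)
    out[1] = theta[1] + alpha - np.roll(alpha, -1, axis=1)
    return out

rng = np.random.default_rng(1)
theta = rng.uniform(-np.pi, np.pi, size=(2, 6, 6))   # random link configuration
alpha = rng.uniform(-np.pi, np.pi, size=(6, 6))      # random gauge transformation

before = np.exp(1j * plaquette_angle(theta))
after = np.exp(1j * plaquette_angle(gauge_transform(theta, alpha)))
invariant = np.allclose(before, after)
```

The individual links change under the transformation while the closed loop does not, which is why a network built from such loops respects the symmetry by construction, whereas a generic network acting on raw links has to learn it.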
Computational units in artificial neural networks follow a simplified model of biological neurons. In the biological model, the output signal of a neuron runs down the axon, splits following the many branches at its end, and passes identically to all the downward neurons of the network. Each downward neuron uses its copy of this signal as one of many dendritic inputs, integrates them all, and fires an output if the result is above some threshold. In the artificial neural network, this translates to the fact that the nonlinear filtering of the signal is performed in the upward neuron, meaning that in practice the same activation is shared between all the downward neurons that use that signal as their input. Dendrites thus play a passive role. We propose a slightly more complex model for the biological neuron, where dendrites play an active role: the activation in the output of the upward neuron becomes optional, and instead the signals going through each dendrite undergo independent nonlinear filtering before the linear combination. We implement this new model in a ReLU computational unit and discuss its biological plausibility. We compare this new computational unit with the standard one and describe it from a geometrical point of view. We provide a Keras implementation of this unit in fully connected and convolutional layers and estimate the resulting change in FLOPs and number of weights. We then use these layers in ResNet architectures on CIFAR-10, CIFAR-100, Imagenette, and Imagewoof, obtaining performance improvements over standard ResNets of up to 1.73%. Finally, we prove a universal representation theorem for continuous functions on compact sets and show that this new unit has more representational power than its standard counterpart.
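The difference between the two neuron models can be sketched in a few lines of NumPy. This is an illustration only, not the authors' Keras implementation: it assumes that each dendrite applies a ReLU with its own bias `B[j, i]` before the weighted sum, and the names `standard_unit` and `dendritic_unit` are hypothetical.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def standard_unit(x_pre, W):
    """Standard model: the upward neuron applies one activation,
    and every downward neuron receives the same filtered signal."""
    return W @ relu(x_pre)

def dendritic_unit(x_pre, W, B):
    """Active-dendrite model (assumed form): each connection (j, i) filters
    the raw signal with its own ReLU threshold before the linear combination."""
    filtered = relu(x_pre[None, :] + B)      # shape (n_out, n_in)
    return np.sum(W * filtered, axis=1)      # shape (n_out,)

rng = np.random.default_rng(0)
x_pre = rng.normal(size=4)        # pre-activation signals of 4 upward neurons
W = rng.normal(size=(3, 4))       # weights of 3 downward neurons
B = rng.normal(size=(3, 4))       # one bias per dendrite (assumption)

y_std = standard_unit(x_pre, W)
y_den = dendritic_unit(x_pre, W, B)
```

Under this assumed form, setting all dendritic biases to zero recovers the standard unit, so the dendritic variant is a strict generalization at the cost of one extra parameter per connection.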
Real-world robotic grasping can be done robustly if a complete 3D Point Cloud Data (PCD) of an object is available. However, in practice, PCDs are often incomplete when objects are viewed from few and sparse viewpoints before the grasping action, leading to the generation of wrong or inaccurate grasp poses. We propose a novel grasping strategy, named 3DSGrasp, that predicts the missing geometry from the partial PCD to produce reliable grasp poses. Our proposed PCD completion network is a Transformer-based encoder-decoder network with an Offset-Attention layer. Our network is inherently invariant to the object pose and to point permutation, so it generates PCDs that are geometrically consistent and properly completed. Experiments on a wide range of partial PCDs show that 3DSGrasp outperforms the best state-of-the-art method on PCD completion tasks and largely improves the grasping success rate in real-world scenarios. The code and dataset will be made available upon acceptance.
Uncertainty quantification is crucial to inverse problems, as it can provide decision-makers with valuable information about the inversion results. For example, seismic inversion is a notoriously ill-posed inverse problem due to the band-limited and noisy nature of seismic data. It is therefore of paramount importance to quantify the uncertainties associated with the inversion process to ease the subsequent interpretation and decision-making processes. Within this framework, sampling from a target posterior provides a fundamental approach to quantifying the uncertainty in seismic inversion. However, selecting appropriate prior information in a probabilistic inversion is crucial, yet non-trivial, as it influences the ability of sampling-based inference to provide geological realism in the posterior samples. To overcome such limitations, we present a regularized variational inference framework that performs posterior inference by implicitly regularizing the Kullback-Leibler divergence loss with a CNN-based denoiser by means of Plug-and-Play methods. We call this new algorithm Plug-and-Play Stein Variational Gradient Descent (PnP-SVGD) and demonstrate its ability to produce high-resolution, trustworthy samples representative of the subsurface structures, which we argue could be used for post-inference tasks such as reservoir modelling and history matching. To validate the proposed method, numerical tests are performed on both synthetic and field post-stack seismic data.
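For orientation, the sketch below shows one update of plain Stein Variational Gradient Descent, the sampler that PnP-SVGD builds on. It deliberately omits the Plug-and-Play denoiser regularization, which the abstract does not specify in detail, and it assumes one-dimensional particles, an RBF kernel with fixed bandwidth `h`, and a toy Gaussian target; the function name `svgd_step` is hypothetical.

```python
import numpy as np

def svgd_step(x, grad_log_p, step=0.1, h=1.0):
    """One plain SVGD update for 1D particles x (no Plug-and-Play term)."""
    diff = x[:, None] - x[None, :]             # diff[j, i] = x_j - x_i
    k = np.exp(-diff**2 / (2.0 * h**2))        # k[j, i] = k(x_j, x_i), RBF kernel
    grad_k = -diff / h**2 * k                  # derivative of k(x_j, x_i) w.r.t. x_j
    # phi(x_i) = mean_j [ k(x_j, x_i) * grad_log_p(x_j) + d/dx_j k(x_j, x_i) ]
    # First term: attraction towards high posterior density.
    # Second term: repulsion that keeps the particles spread out.
    phi = (k * grad_log_p(x)[:, None] + grad_k).mean(axis=0)
    return x + step * phi

# Toy target (assumption): a Gaussian posterior with mean 3 and unit variance.
def grad_log_p(x):
    return -(x - 3.0)

particles = np.linspace(-1.0, 1.0, 20)         # initial particles, all below the target mean
updated = svgd_step(particles, grad_log_p)
```

Iterating this step moves the particle ensemble towards the target while the kernel gradient prevents collapse onto a single mode estimate, which is what makes the particles usable as posterior samples for uncertainty quantification.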
Explainability is a vibrant research topic in the artificial intelligence community, with growing interest across methods and domains. Much has been written about the topic, yet explainability still lacks shared terminology and a framework capable of providing structural soundness to explanations. In our work, we address these issues by proposing a novel definition of explanation that is a synthesis of what can be found in the literature. We recognize that explanations are not atomic but the product of evidence stemming from the model and its input-output and the human interpretation of this evidence. Furthermore, we fit explanations into the properties of faithfulness (i.e., the explanation being a true description of the model's decision-making) and plausibility (i.e., how convincing the explanation looks to the user). Using our proposed theoretical framework simplifies how these properties are operationalized and provides new insight into common explanation methods that we analyze as case studies.